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Abstract. In this paper, we present a method that enables solving in 

parallel the Euler-Lagrange system associated with the optimal control 

2 of a parabolic equation. Our approach is based on an iterative update 

M of a sequence of intermediate targets that gives rise to independent sub- 

I I problems that can be solved in parallel. This method can be coupled 

with the parareal in time algorithm. Numerical experiments show the 

"^ efficiency of our method. 
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>• 1. Introduction 

S^ In the last decade, parallelism across the time [31, based on the decomposition 
J-H of the time domain has been exploited to accelerate the simulation of systems 
governed by time dependent partial differential equations ^. Among others, 
the parareal algorithm f5: or multi-shooting schemes [2] have shown excellent 
results. In the framework of optimal control, this approach has been used to 
control parabolic systems [3 |H] • 

In this paper, we introduce a new approach to tackle such problems. 
The strategy we follow is based on the concept of target trajectory that 
has been introduced in the case of hyperbolic systems in [£] . Because of the 
irreversibility of parabolic equations, a new definition of this trajectory is 
considered. It enables us to define at each end point of the time sub-domains 
relevant initial conditions and intermediate targets, so that the initial problem 
is split up into independent optimization problems. 

The paper is organized as follows: the optimal control problem is intro- 
duced in Section [2] and the parallelization setting is described in Section [3] 
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The properties of the cost functionals involved in the control problem are 
studied in Section l4J The general structure of our algorithm is given in Sec- 
tion [5] and its convergences is proven in Section [6] In Section [7) we propose a 
fully parallelized version of our algorithm. Some numerical tests showing the 
efRciency of our approach are presented in Section [8j 

In the sequel, we consider the optimal control problem associated with 
the heat equation on a compact set Q and a time interval [0, T], with T > 0. 
We denote by ||.||o the space norm associated with L^(ri), and by ||.||o^ the 
L^-norm corresponding to a sub-domain flc C O. Also, we use the notations 
||.||„ (resp. ||.||t,„) and (.,.)u (resp. (.,.)^^) to represent the norm and the 
scalar product of the Hilbert space L^(0,T; J7c) (resp. L'^{I';fic)), with /' a 
sub- interval of [0,T] ). Given a function y defined on the time interval [0,T], 
we denote by y|/' the restriction of y to I' . 



2. Optimal control problem 

Given a > 0, consider the optimal control problem defined by: 



min J(v) 

«eL2([o,T];L2(n,)) 



with 



where ytarget is a given state in L'^{fl). The state y evolves from j/o on [0,r] 
according to 

dty — vAy — Bv. 

In this equation, A denotes the Laplace operator, v is the control term, 
applied on fie and B is the natural injection from f^c into fi. We assume 
Dirichlet conditions for y on the boundary of 51. 
The corresponding optimality system reads as 



dty — vAy — Bv 
2/(0) = 2/0, 


on [0, T]xn 


(2.1) 


dtp + i^Ap = 
P{T) - y{T) 


on [0, T]xn 

y tar get 1 


(2.2) 


av + B*p = 


0, 


(2.3) 



where B* is the adjoint operator of B. 

Note that for any a > 0, the functional J is continuous, a-convex in 



L^{Vtc) and consequently the system (2.1 2.3 1 has a unique solution by v* . 



We denote by y* , p* the associated state and adjoint state. 
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3. Time parallelization setting 

In this section, we describe the relevant setting for a time parallelized reso- 
lution of the optimality system. 
Consider iV > 1 and a subdivision of [0, T] of the form: 

[0,T]=U,t-oi/„, 

with /„ = [tmtn+i], to = < ti < ... < tis[-i < tjq = T. For the sake 
on simplicity, we assume here that the subdivision is uniform, i.e. for n = 
0, . . . , TV - 1 we assume that t„+i - i„ = T/N ; we denote AT = T/N. Given 
a control v and its corresponding state y and adjoint state p, we define the 
target trajectory by: 

X = y~P on[0,r]xfl. (3.1) 

The trajectory x is not governed by a partial differential equation, but reaches 



x{T) — ytarget ^t time T from (2.2,), hence its denomination 



For n = 0, . . . ; iV — 1, consider the sub-problems 

min J„(w„), (3.2) 

with 

^nK) = ^ll2/n(t„+l)-x(i«+l)||o+f y hnifyfadi, (3.3) 

where the function y„ is defined by 



dtyn - vl^y-a = Sw„ on /„ X fi 

Vn{tn) = y{tn)- 



(3.4) 



Recall that this optimal control problem is parameterized by v (and y and 
p) through the local target x(t„-|_i), we note that this sub-problem has the 
same structure as the original one, and is also strictly convex. The optimality 



system associated with this optimization problem is given by (3.4) and the 
equations 



dtVn + vIS.-pn = on /„ X r2 

P«(in+l) = V{tn+\) - x{K+\) 



(3.5) 
avn + 6*p„ = 0, (3.6) 



we denote by v* its solution. 



4. Some properties of J and J„ 

The introduction of the target trajectory in the last section is motivated by 
the following result. 
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Lemma 1. Denote by x* the target trajectory defined by (3.1 1 with y = y* and 



p = p* and by y^jp'^, w* the solutions of ii3^- 3.6) with y = y* and x = X* ■ 
One has: 

Proof. Thanks to the uniqueness of the solution of the sub-problem, it is 
enough to show that w^ satisfies the optimality system (3.4 3. 6|. 



First, note that y^j obviously satisfies (3.4) with w„ = W|*^ . it directly follows 
from the definition of x* (see (3.1)), that: 

P*itn+l) = y*{tn + l) - X*{tn+l), 



SO that p'^j satisfies ( |3.5[ ). Finally, Equation (3.6) is a consequence of (2.3). 
The result follows. D 

Let HJ denote the hessian operator associated with J; there exists a 
strong connection between the hessian operators HJ and HJn of J and J„, 
as indicated in the next lemma. 

Lemma 2. The hessian operator HJn coincides with the restriction of HJ to 
controls whose time supports are included in [t]^_i,T]. 

Proof. First note that J is quadratic so that HJ is a constant operator. Given 
an increase dv E L^([0,T]; L^(i7c)), we have: 

{HJ{Sv),dv), = \\SyiT)\\l + a f \\Sv{t)\\ldt, 



where Sy is the solution of 

dtSy - vMy = B5v on [0, T]xVL 
5y{Q) - 0. 

Given 1 < n < A^, consider now an increase 5vn G L^{In', L'^i^c))- One finds 
in the same way that: 



(4.1) 



{HJn{Sv„),SVn)v^ = ll'52/«(tn+l)|ln +" 



t„+l 
2 , „, / llJt,, /+Ml2 



\SVn{t)rndt, 



(4.2) 



where Sy„ is the solution of 

f dtSyn-vASyn = BSvn on [t„,t„+i] x ^i 
1 Synitn) = 0. 

Suppose now that 5v = on [0,iAr_i], it is a simple matter to check that 
6y = over [0,i7v-i]- The restriction of Sy on the interval [ijv-i,?^] thus 
satisfies Sy{tN^i) ~ and is consequently (up to a time translation) the 
solution of (|t!2l). D 



We end this section with an estimate on these hessian operators. 
Lemma 3. Given 6v e L'^{[0,T];L'^{flc)), one has: 



a 



T fT 

2 ^^- ^ lu TIX„.\ x„.\ ^ a i ll.>;„,/+Ml2 



\\&v(t)\\iidt < {HJi5v),6v)^, < (3 / \\5v{t)\\t,Jt, (4.3) 

Jo 

where (3 — Q! + C/v2, with C the Poincare's constant associated with L^(i7). 



Parareal intermediate targets methods for optimal control problem 5 

The proof of this result is standard and given in Appendix for the sake 
of completeness. Because of Lemma [5] the hessian operator HJn also satis- 



fies (4.31 



5. Algorithm 

We are now in a position to propose a time parallelized procedure to solve 



(2.1 2.3[ ). In what follows we describe the principal steps of a parallel al- 
gorithm named "sitpoc" ( Serial mtermediate Targets for Parallel Optimal 
Control). 

Algorithm 4 (siTPOC). Consider an initial control v^ and suppose that, at 
step k one knows v^ . The computation of v^^^ is achieved as follows: 

I. Compute y^ , p^ and the associated target trajectory x^ according to 



(2.1), (2.2) and (3.1) respectively. 



II. Solve approximately the N sub-problems (3.2) in parallel. For n = 
0, ... ,7V — 1, denote by 'D^^^ the corresponding solutions and by f;'^+^ 
the concatenation of (t'^^^)n=o,...,w-i- 
III. Define v^^^ by v^+^ = (1 - 9'')v^ -I- 9^v^+^, where 9^ is defined to 
minimize J((l - 9^)v'' -I- 9^v^+^). 

Note that we do not explain in detail here the optimization step (Step 
[ll| and rather present a general structure of our algorithm. Because of the 
strictly convex setting, some steps of, e.g., a gradient method or a small 
number of conjugate gradient method step can be used. 



6. Convergence 

The convergence of Algorithm [4] can be guaranteed under some assumptions. 
In what follows, we denote by V J the gradient of J. 

Theorem 6.1. Suppose that the sequence {v^)ke^ defined in AlgorithmiA sat- 
isfies, for all k > 0: 

J{v^) ^ Jiv°°), (6.1) 

(VJ(«'=),i;'=+i-i;'^)„<0, (6.2) 

and 

\\yJiv')h<v\\v''^'~v''\\v, (6.3) 

for a given rj > 0. Then (ti'^)fcgN converges linearly with a rate (1 %-) S 

[0, 1) to the solution of pJp^ 



))- 



Note that in the case (6.1 ) is not satisfied, there exists ko £ N such that 



V ° = v°° and the optimum is reached in a finite number of steps. 



6 



Y. Maday, M.-K. Riahi and J. Salomon 



Proof. Define the shifted functional 

J{v) = Jiv) - JK), 
and note that because of the definition of v* , one has 

Jiv)=^-{HJiv-v*),v-v*)^<^\\v~v*\\l. 

Since J is quadratic, for any v G L'^{Q,c) 

VJ{v) =HJ{v-v*), 
and consequently 

{VJ{v),v - V*), = {HJ{v ~ v*),v - v*)y > a\\v - v*\\l, 
so that 

\\v-v*\\,< -\\VJ{v)\\,. 



(6.4) 



Combining (6.4) and (6.5), one gets 



yveL^iQc), yJiv)<j\\VJiv)\\,, 



(6.5) 



(6.6) 



with 7 — ^ 



2v^- 



On the other hand, the variations in the functional between two iterations of 
our algorithm reads as 



> {VJiv''),v'-v''+'), + -\\v'~v' 



k _„fc+l||2 



Combining this last inequality with (6.2 1, one finds that 



fe „,fe+l||2 



Since J{v'') - J{v^+^) = J{v^) - J{v^+'^) > 0, we have 

1 



(6.7) 



Jiv'')-Jj{v''+^) > 



2Jj{v^) 



{J{v')-J{v''+')) 



> 



> 



> 



-Wv'-v'^^X 



4^J(w'=) 
a 

47||VJ(w)|| 



-wv'-v'+r. 



477/1 



fe „,fe+l||2 






> c\\v''~v''+^\ 



(6.8) 

(6.9) 
(6.10) 
(6.11) 



where c = ^j — ^- Indeed (6.8 1 follows from (6.7), (6.9 1 from (6.6) and 



(6.101 from (6.3). It follows from the monotonic convergence of V J{v^) that 
the sequence w*^ is Cauchy, thus its convergence. 
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Let us now study the convergence rate. Define r'' — X^iJ^ ||f^^^ — w''||u. 
Summing (6.111 between k and +00, we obtain: 



J{v'') > cr'' 



Using again (6.6) and (6.3), one finds that: 

?77(r'' -r''+^) > cr'' . 



2a 



Note that this incquahty miphes that 1 ^ > 0. Define C :— 



(6.12) 
we 



have < C < 1. Because of (6.12) 



(1 - cy'^r'' > (1 - c)-('=+iV'^+\ 

and the result follows. 



D 



We now give an example where hypothesis (6.2-6.3) are satisfied. 
Corollary 6.2. Assume that Step [^ of Algorithm R] is achieved using only 
one step of a locally optimal step gradient method and that at step k, the 



algorithm is initialized with w„ •= ^|/ j then (6.2-6.3) are satisfied hence the 



algorithm converges to the solution of (2.1-2.3T. 

Proof. Because of the assumptions, the optimization step (Step.M reads: 



,-,fe+i _ „fe 



v: 



<-pnVJ„K). 



Since the functionals J„ are quadratic, one has: 



pI 



(i/J„(VJ„(^^)),VJ„(i;^)),„„' 
A first consequence of these equalities is that: 



r,fc+i 



{^Uvl),v. 
Moreover Lemmas [2] and |3] imply: 



-p!;I|vj„(«:;)||1 < 0. 



1 . k 



1 
< -. 

a 



(6.13) 



(6.14) 



One can also obtain similar estimates of 9^ . In this view, note first that since 
the only iteration which is considered uses as directions of descent VJ„(wfJ = 
VJ{v%i^^. Then: 



(VJ(w'^),w 



~k+i 



{HJ{i^+^ -v''),u 



71a!~pJ. qjfi\ 



N 



{HJ{v'' 



,-^fe+i 



Z^ r,k II '^n 
n=l '^" 



..k\\2 



Using (6.14), one deduces: 



< 



< 



/3 



a 



(6.15) 
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This preliminary results will now be used to prove the theorem. The proof 



of (6.2), follows from (6.13) 



{wj{v''),v^+^-v^), = e^{\/j{v^), 



N 



0'=^(VJ„(^;^),z)^l-^;^)„„<O. 



This last estimate is a consequence of (6.13). It remains to prove (6.3). We 
have: 



k'=+i-t;'=| 



= e^\\i^+^-v^\ 



< 



and the result follows. 



N 



E||f,fe+1 _ ,,fe||2 
\ n=l 



N 



\ n=l 






P^ 



|VJ(«'=)IU, 



D 



7. Parareal acceleration 

The method we have presented with algorithm [4| req uires in Step |T] two se- 
quential resolutions of the evolution Equation (2.1) on the whole interval 



[0,T], which does not fit with the parallel setting. In this section, we make 
use of the parareal algorithm to parallelize the corresponding computations. 

7.1. Setting 

Let us first recall the main features of the parareal algorithm. We consider the 
example of Equation ( |2.1[ ). In order to solve in parallel an evolution equation, 
for the parareal scheme @] we introduce intermediate initial conditions at 
times (tn)n=Q,...,N-i that are updated iteratively. Suppose that these values 
(A^) are known at step k. Denote by tJ„(A„) and J>i(A„) coarse and fine 
solutions of (3.4) at time t„+i with A„ as initial value. The update is done 



according to the following iteration: 

We use this procedure in Step |T] of Algorithm |4j The idea we follow consists in 
merging the two procedures, i.e. doing one parareal iteration at each iteration 
of our algorithm. 
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7.2. Algorithm 

We now give details on the resulting procedure. Since the evolution equa- 
tions depend on the control, we replace the notations tJ„(A„) and J>i(A„) by 
Gn{Xn, Vn) and J>i(A„, Vji) respectively. As we need backward solvers to com- 



pute p, see (2.2), we also introduce Gni^J■n+l) and TnifJ-n+i) to denote coarse 



and fine solutions of (3.5 1 at time t„ with /x^+i as "initial" value (given at 
time t„+i). Note that these bakward solvers J-"„ (resp: Gn) do not depend on 
the control. 

We describe in the following the principal steps of an enhanced ver- 
sion of the SITPOC algorithm which we give the name "pitpoc" as Parareal 
mtermediate Targets for Optimal Control. 

Algorithm 5 (pitpoc). Denote by v^ = vK . Consider a control (w")n=o....,w^i, 
initial values {X^)n=o,....N (through forward scheme X^+i — ^"(^n^'^n)): f^^sl 
values {fJ-n)n=i,...,N (through backward scheme /i° — Gn{^J■n+l)■ 
Suppose that, at step k one knows v'^, {Xn)n=o....,N and {l^n)n=i,....,N- The 
computation of w'^^^, (Aj;+^)„=o,...,JV and (/iJi^^)n=i,...,Ar is achieved as fol- 
lows: 

I. Build the target trajectory {Xn)n=i,....N according to a definition similar 



to ( |3.1[ ): 

^*; _ \*; _ ,,fc 

An ~ ^n r-n- 



II. Solve approximately the N sub-problems (3.2) in parallel. For n — 
0, . . . ,N — 1, denote by v^'^^ the corresponding solutions. 

III. Define v''^^ as the concatenation of the sequence ('5^^^)„=o,...,Af-i- 

IV. Compute (A^+^)„=o,...,Ar, {fJ-n'^^)n=i....,N by: 

^^n^' = Gn{^i':X\) + Tr,{^^i+^)-Gn{^4.+^). 

V. Define v^+^ and {\i+^)n=Q....,N 

yk+i ^ (l-6l'^)i,'=+6l'=w'=+i, 

where 9^ is defined to minimize 
VI. k = k + 1 and return to I. 



8. Numerical Results 

In this section, we test the efficiency of our method and show how robust the 
approach is. We consider two independent parts describing numerical results 
of the selected algorithm. 
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8.1. Setting 

We consider a 2D example, where Q = [0, 1] x [0, 1] and Oc = [|, |] x [|, |]- 
The parameters related to our control problem are T = 6.4, a =^ 10^^ and 
ly — 10~^. The time interval is discretized using a uniform step St = 10^^, 
and an Implicit-Euler solver is used to approximate the solution of Equa- 
tions (2.1 2.2). For the space discretization, we use 



finite elements. Our 
implementation makes use of the freeware FreeFem [^ and the parallelization 
is achieved thanks to the Message Passing Interface library. The independent 
optimization procedures required in Step In] are simply carried out using one 
step of an optimal gradient method. 

8.2. Influence of the number of sub-intervals 

In this section, Step II of Algorithm |4] and Algorithm [5] are achieved by using 
one step of an optimal step gradient method. We first test our algorithm 
by varying the number of sub-intervals. The evolution of the cost functional 
values are plotted with respect to the number of iteration (Figure IT]), the 
number of matrix multiplication (Figure^ and the number of wall-clock time 
of computation (Figure Isl) . We first note that Algorithm 4 actually acts as 





Figure 1 . Decaying cost functional values according to the 
iterations count with respect to SITPOC algorithm (left) and 
PITPOC algorithm (right). 



a preconditioner, since it improves the convergence rate of the optimization 
process. The introduceion of the intermediates targets allows to accelerate 
the decrease of the functional values, as shown in Figure [2] (left) . Note that 
this property holds mostly for small numbers of sub-intervals, and disapears 
when dealing with large subdivisions. This feature is lost when considering 
Algorithm [5I whose convergence does not significantly depend on the number 
of sub-intervals that is considered, see Figure [2] (right). 
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NLnber of operiatian of nultiplications 



NLnber of operiatian of nultiplications 



Figure 2 . Decaying cost functional values according to the 
multiplication operations count with respect to SITPOC al- 
gorithm (left) and pitpoc algorithm (right). 

On the contrary, Algorithm [s] achieves a good acceleration when con- 
sidering the number of mutliplications involved in the computations. The 
corresponding results are shown in Figure [2] where the parallel operations 
have been counted only once. We see that Algorithm is close to the full effi- 
ciency, since the number of multiplications required to obtain a given value 
for the cost functional is roughly proportional to jj. 

We finally consider the wall-clock time required to carry out our al- 
gorithms. As the main part of the operations involved in the computation 
consists in matrix multiplications, the results we present in Figure 3 are close 
to the ones of Figure 2. 

8.3. Influence of the number of steps in the optimization method 

We now vary the number of steps of the gradient method used in Step II of 
our algorithm. The results are presented in Figure 4. Subdivisions of iV = 4 
and A'^ = 16 intervals are considered. In both cases, we see that an increase 
in the number of gradient steps improves the preconditionning feature of our 
algorithm. However, we also observe that this strategy saturates for large 
numbers of gradient steps which probably reveals that the sub-problems con- 
sidered in Step II are practically solved after 5 sub-iterations. 

More results can be found in [ID]. 



Appendix 

For the sake of completeness, we recall here the proof of Lemma [3j 
Because of (4.1 1 and thanks to Young's inequality, one has for all t E [Q,T] 
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Figure 3. Decaying cost functional values according to 
elapsed real time with respect to SITPOC algorithm (left) 
and PITPOC algorithm (right). 
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Figure 4. SITPOC algorithm with 4 subdivisions (left) 
and 16 subdivision (right): variation of the number of 
(lower/local) inner-iterations ^max- 



and all e > 
1 d 



2dr 



Mi)llo + H|V.^y(t)||^, 



Sy{t)BSv{t)dt 



< Uewsymi + lwBSvimi], 



(8.1) 
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where V^; denotes the gradient with respect to the space variable. As 5y is 
supposed to satisfies Dirichlet conditions, one can apply Poincare's inequality 
to obtain: 

\\5y{t)\\n<C\\V^5y{t)\\n, 



for a given C > 0. Combining this last estimate with (8.1 1, one gets: 

\j^¥yml < (I - ^) \\5v^)\\l + Y^\\B&v{ml- 

Now, setting e = ^ gives: 

^^\\6y{t)\\l<\\\B5v{t)\\l. 
Since |i(5y(0)||^^ = 0, the resuh follows with the fact that |l;B|l2 < 1. 
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